Aminet 7

home *** CD-ROM | disk | FTP | other *** search

/ Aminet 7 / Aminet 7 - August 1995.iso / Aminet / docs / misc / ConcNews.lha / news / general.programming / comp.programming_1573_000023.msg < prev next >

Wrap

Internet Message Format | 1994-11-27 | 1.9 KB

Path: dd.chalmers.se!news.chalmers.se!sunic!EU.net!howland.reston.ans.net!europa.eng.gtefsd.com!library.ucla.edu!csulb.edu!paris.ics.uci.edu!news.claremont.edu!kaiwan.com!kaiwan!preston From: preston@kaiwan.com (Preston L. Bannister) Newsgroups: comp.programming Subject: Re: Hash function for strings Date: 3 Mar 1994 13:05:53 -0800 Organization: Upstanding Systems Lines: 41 Message-ID: <preston.762728626@kaiwan> References: <amundsj.1.0@novell.oih.no> NNTP-Posting-Host: kaiwan.kaiwan.com In <amundsj.1.0@novell.oih.no> amundsj@novell.oih.no (AMUNDSEN JARLE/3AA/D) writes: >I am trying to find a good hash function for strings, names that is, and >wonder if anyone has an idea on how good such a function can hash. Given >that the keys are not known in advance. >The keys I used in the test were all different, 1500 in all, and the size >of the hash table was 6001. At best, a hash function supplied a unique >adress to 75% of the keys, while the rest collided. I'm rather fond of a simple hash function published in the Communications of the ACM a few years back (sorry, no reference). Briefly the function is: char t[256]; /* contains the values 0..255 randomly shuffled */ char hash(const char* s) { char h = 0; while (*s) { h = t[h ^ *s++]; } return h; } The downside is that this only generates small hash keys (0..255). The upside is that hash function is _very_ cheap, and tends to generate a very uniform distribution of hash keys. Also, strings that hash to the same value tend to be very dissimilar, so searching a hash bucket (for instance) is quite fast as the strings miscompare in the first few characters. You can generate bigger hash keys by running the hash function more than once, with a different initial value for h, and concatenating the results. This can be faster than running a more complex hash function just once. -- Preston L. Bannister Upstanding Systems preston@kaiwan.com